13 research outputs found
White, Man, and Highly Followed: Gender and Race Inequalities in Twitter
Social media is considered a democratic space in which people connect and
interact with each other regardless of their gender, race, or any other
demographic factor. Despite numerous efforts that explore demographic factors
in social media, it is still unclear whether social media perpetuates old
inequalities from the offline world. In this paper, we attempt to identify
the gender and race of Twitter users located in the U.S. using advanced image
processing algorithms from Face++. Then, we investigate how different
demographic groups (i.e. male/female, Asian/Black/White) connect with each other. We
quantify to what extent each group follows and interacts with the others and the
extent to which these connections and interactions are reflected in inequalities in
Twitter. Our analysis shows that users identified as White and male tend to
attain higher positions in Twitter, in terms of the number of followers and
number of times in users' lists. We hope our effort can stimulate the
development of new theories of demographic information in the online space.
Comment: In Proceedings of the IEEE/WIC/ACM International Conference on Web
Intelligence (WI'17). Leipzig, Germany. August 201
Recommended from our members
Query Optimizations for Deep Learning Systems
Deep Learning (DL) has unlocked unstructured data for analytics. It has enabled new applications, insights, and value in various domains, including enterprises, domain sciences, and healthcare. However, DL workloads are highly resource-intensive and time-consuming, which hinders their adoption. Thus, optimizing them from a systems standpoint has attracted significant attention in recent years. In this dissertation, we fundamentally re-imagine DL workloads as data processing workloads and optimize them from a data management standpoint. Using a combination of abstractions already available in DL practice, new algorithms, system design, and theoretical and empirical analysis, we show how classical query optimization ideas such as rewrites, multi-query optimization, materialization optimization, incremental view maintenance, approximate query processing, and predicate push-down can be re-imagined in the context of DL workloads to optimize them. We show that our techniques can enable significant runtime and resource savings (even over 10X in some cases) for a variety of popular and important end-to-end DL workloads. Our work fills a critical technical gap in DL systems architecture and opens up new connections between query optimization and DL systems.
Better Data Discoverability in Science Gateways
Science gateways primarily focused on remote job execution management generate domain-specific output data mainly readable by application-specific parsers and post-processing utilities. For example, computational chemistry data outputs encode molecule information, convergence of the simulation and energy values. Such domain-specific information is non-trivial to search in a generic fashion. It is thus desirable to add a wide range of application-specific and user-specific post-processing features that may include remote executions of scripts and smaller applications that don’t require scheduling on clusters. It is also desirable to support integrations with searching, indexing, and general purpose data analysis and mining tools provided by the Apache “big data” software stack. As gateways become tenants to general purpose platform services, providing a general purpose infrastructure that enables these application-specific post-processing steps is an interesting architectural challenge. Furthermore, it is desirable to share results from the post-processing and indexing. In this paper, we discuss how we have incorporated a new automated application output indexing system for the SEAGrid Science Gateway using Apache Airavata that will parse and index generated output for easy querying. We also examine data sharing and automated data publication so that another user can reuse the results without running an already executed experiment and hence reduce resource utilization.
Gendered Conversation in a Social Game-Streaming Platform
Online social media and games are increasingly replacing offline social activities. Social media is now an indispensable mode of communication; online gaming is not only a genuine social activity but also a popular spectator sport. Although online interaction shrinks social and geographical barriers, it is argued that social disparities, such as gender inequality, persist. For instance, online gaming communities have been criticized for objectifying women, which is a pressing question as gaming evolves into a social platform. However, few large-scale, systematic studies of gender inequality and objectification in social gaming platforms exist. Here we analyze more than one billion chat messages from Twitch, a social game-streaming platform, to study how the gender of streamers is associated with the nature of conversation. We find that female streamers receive significantly more objectifying comments while male streamers receive more game-related comments. This difference is more pronounced for popular streamers. We also show that the viewers’ choice of channels is strongly gendered. Our findings suggest that gendered conversation and objectification are prevalent, and most users produce strongly gendered messages.
Tensors: an abstraction for general data processing
Deep Learning (DL) has created a growing demand for simpler ways to develop complex models and efficient ways to execute them. Thus, a significant effort has gone into frameworks like PyTorch or TensorFlow to support a variety of DL models and run efficiently and seamlessly over heterogeneous and distributed hardware. Since these frameworks will continue improving given the predominance of DL workloads, it is natural to ask what else can be done with them. This is not a trivial question since these frameworks are based on the efficient implementation of tensors, which are well adapted to DL but, in principle, to nothing else. In this paper we explore to what extent Tensor Computation Runtimes (TCRs) can support non-ML data processing applications, so that other use cases can take advantage of the investments made in TCRs. In particular, we are interested in graph processing and relational operators, two use cases that are very different from ML, are in high demand, and complement quite well what TCRs can do today. Building on Hummingbird, a recent platform converting traditional machine learning algorithms to tensor computations, we explore how to map selected graph processing and relational operator algorithms into tensor computations. Our vision is supported by the results: our code often outperforms custom-built C++ and CUDA kernels, while massively reducing the development effort and taking advantage of the cross-platform compilation capabilities of TCRs.
ISSN:2150-809
Using Keycloak for Gateway Authentication and Authorization
Establishing users’ identities before they access research infrastructure resources is a key feature of science gateways. With many science gateways now relying on general purpose gateway platform services, the challenges of managing identity-derived features have expanded to include authorization between science gateway tenants, middleware, and third party identity provider services. The latter include campus identity management systems. This paper examines the use of Keycloak as an implementation of an identity management system for Apache Airavata middleware, replacing our previous WSO2 Identity Server-based implementation. This effort raises larger issues that software-as-a-service communities should consider when embedding dependencies on third party software and services, including developing selection criteria and future-proofing systems.
Low movement, deep-learned sitting patterns, and sedentary behavior in the International Study of Childhood Obesity, Lifestyle and the Environment (ISCOLE).
BACKGROUND/OBJECTIVES: Sedentary behavior (SB) has both movement and postural components, but most SB research has only assessed low movement, especially in children. The purpose of this study was to compare estimates and health associations of SB when derived from a standard accelerometer cut-point, a novel sitting detection technique (CNN Hip Accelerometer Posture for Children; CHAP-Child), and both combined. METHODS: Data were from the International Study of Childhood Obesity, Lifestyle, and the Environment (ISCOLE). Participants were 6103 children (mean ± SD age 10.4 ± 0.56 years) from 12 countries who wore an ActiGraph GT3X+ accelerometer on the right hip for approximately one week. We calculated SB time, mean SB bout duration, and SB breaks using a cut-point (SBmovement), CHAP-Child (SBposture), and both methods combined (SBcombined). Mixed effects regression was used to test associations of SB variables with pediatric obesity variables (waist circumference, body fat percentage, and body mass index z-score). RESULTS: After adjusting for MVPA, SBposture showed several significant obesity associations favoring lower mean SB bout duration (b = 0.251 to 0.449; all p < 0.001) and higher SB breaks (b = -0.005 to -0.052; all p < 0.001). Lower total SB was unexpectedly related to greater obesity (b = -0.077 to -0.649; p from <0.001 to 0.02). For mean SB bout duration and SB breaks, more associations were observed for SBposture (n = 5) than for SBmovement (n = 3) or SBcombined (n = 1), and these tended to have larger magnitude as well. CONCLUSIONS: Using traditional measures of low movement as a surrogate for SB may lead to underestimated or undetected adverse associations between SB and obesity. CHAP-Child allows assessment of sitting posture using hip-worn accelerometers. Ongoing work is needed to understand how low movement and posture are related to one another, as well as their potential health implications.
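The three SB variables named in this abstract (SB time, mean SB bout duration, SB breaks) can be illustrated with a minimal sketch. It assumes a minute-level sequence of booleans, treats a bout as any run of consecutive sedentary minutes, and counts a break whenever a sedentary minute is followed by a non-sedentary one. These definitions and the function name `sedentary_summary` are illustrative assumptions, not the study's processing rules.

```python
from itertools import groupby


def sedentary_summary(minutes):
    """Summarize a list of minute-level flags (True = sedentary minute).

    Assumed definitions (not taken from the study):
    - bout: a run of consecutive sedentary minutes
    - break: a sedentary minute immediately followed by a non-sedentary one
    """
    # Length of every run of consecutive sedentary minutes.
    bouts = [sum(1 for _ in run) for is_sed, run in groupby(minutes) if is_sed]
    total = sum(bouts)
    mean_bout = total / len(bouts) if bouts else 0.0
    # Count sedentary -> non-sedentary transitions.
    breaks = sum(1 for a, b in zip(minutes, minutes[1:]) if a and not b)
    return {"sb_minutes": total, "mean_bout_minutes": mean_bout, "sb_breaks": breaks}


# Hypothetical nine-minute recording: bouts of 3, 2, and 1 minutes.
example = [True, True, True, False, True, True, False, False, True]
summary = sedentary_summary(example)
print(summary)
```

Whether the flags come from a movement cut-point or a posture classifier only changes the input sequence; the summary step stays the same, which is what makes the two approaches comparable.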
Application of Convolutional Neural Network Algorithms for Advancing Sedentary and Activity Bout Classification.
Background: Machine learning has been used for classification of physical behavior bouts from hip-worn accelerometers; however, this research has been limited due to the challenges of directly observing and coding human behavior "in the wild." Deep learning algorithms, such as convolutional neural networks (CNNs), may offer better representation of data than other machine learning algorithms without the need for engineered features and may be better suited to dealing with free-living data. The purpose of this study was to develop a modeling pipeline for evaluation of a CNN model on a free-living data set and compare CNN inputs and results with the commonly used machine learning random forest and logistic regression algorithms. Method: Twenty-eight free-living women wore an ActiGraph GT3X+ accelerometer on their right hip for 7 days. A concurrently worn thigh-mounted activPAL device captured ground truth activity labels. The authors evaluated logistic regression, random forest, and CNN models for classifying sitting, standing, and stepping bouts. The authors also assessed the benefit of performing feature engineering for this task. Results: The CNN classifier performed best (average balanced accuracy for bout classification of sitting, standing, and stepping was 84%) compared with the other methods (56% for logistic regression and 76% for random forest), even without performing any feature engineering. Conclusion: Using the recent advancements in deep neural networks, the authors showed that a CNN model can outperform other methods even without feature engineering. This has important implications for both the model's ability to deal with the complexity of free-living data and its potential transferability to new populations.
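The results above are reported as balanced accuracy. As a minimal sketch, assuming the common definition (the unweighted mean of per-class recall), it can be computed as below; the labels and predictions are made-up illustrations, and the abstract does not state exactly how the authors averaged across classes.

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall, so rare classes count as much as common ones."""
    support = Counter(y_true)  # number of true examples per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = [correct[label] / support[label] for label in support]
    return sum(recalls) / len(recalls)


# Hypothetical bout labels: sitting dominates, as it often does in free-living data.
truth = ["sit"] * 6 + ["stand"] * 3 + ["step"]
pred = ["sit"] * 6 + ["stand", "sit", "sit"] + ["step"]
score = balanced_accuracy(truth, pred)
print(round(score, 3))
```

In this toy case plain accuracy is 8/10, while balanced accuracy is lower because two of the three standing bouts are missed, which is why the metric suits imbalanced activity classes.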
The CNN Hip Accelerometer Posture (CHAP) Method for Classifying Sitting Patterns from Hip Accelerometers: A Validation Study.
Introduction: Sitting patterns predict several healthy aging outcomes. These patterns can potentially be measured using hip-worn accelerometers, but current methods are limited by an inability to detect postural transitions. To overcome these limitations, we developed the Convolutional Neural Network Hip Accelerometer Posture (CHAP) classification method. Methods: CHAP was developed on 709 older adults who wore an ActiGraph GT3X+ accelerometer on the hip, with ground-truth sit/stand labels derived from concurrently worn thigh-worn activPAL inclinometers for up to 7 d. The CHAP method was compared with traditional cut-point methods of sitting pattern classification as well as a previous machine-learned algorithm (two-level behavior classification; TLBC). Results: For minute-level sitting versus nonsitting classification, CHAP performed better (93% agreement with activPAL) than did other methods (74%-83% agreement). CHAP also outperformed other methods in its sensitivity to detecting sit-to-stand transitions: cut-point (73%), TLBC (26%), and CHAP (83%). CHAP's positive predictive value of capturing sit-to-stand transitions was also superior to other methods: cut-point (30%), TLBC (71%), and CHAP (83%). Day-level sitting pattern metrics, such as mean sitting bout duration, derived from CHAP did not differ significantly from activPAL, whereas other methods did: activPAL (15.4 min of mean sitting bout duration), CHAP (15.7 min), cut-point (9.4 min), and TLBC (49.4 min). Conclusion: CHAP was the most accurate method for classifying sit-to-stand transitions and sitting patterns from free-living hip-worn accelerometer data in older adults. This promotes enhanced analysis of older adult movement data, resulting in more accurate measures of sitting patterns and opening the door for large-scale cohort studies into the effects of sitting patterns on healthy aging outcomes.
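Sensitivity and positive predictive value for transition detection can be sketched as follows. The example assumes transitions are given as minute indices and that a predicted transition counts as a hit when it falls within a tolerance window of an unmatched reference transition. The greedy matching rule, the `tolerance` parameter, and the sample times are illustrative assumptions; the abstract does not describe the matching procedure the authors used.

```python
def transition_metrics(reference, predicted, tolerance=1):
    """Sensitivity and PPV of predicted transition times against reference times.

    Assumption: each reference transition is greedily matched to the earliest
    unused prediction within `tolerance` minutes.
    """
    unmatched = sorted(predicted)
    true_pos = 0
    for ref in sorted(reference):
        hit = next((p for p in unmatched if abs(p - ref) <= tolerance), None)
        if hit is not None:
            unmatched.remove(hit)  # each prediction may be used only once
            true_pos += 1
    sensitivity = true_pos / len(reference) if reference else 0.0
    ppv = true_pos / len(predicted) if predicted else 0.0
    return sensitivity, ppv


# Hypothetical sit-to-stand times (minutes since start of recording).
sens, ppv = transition_metrics([10, 45, 90, 130], [11, 46, 70, 100, 131])
print(sens, ppv)
```

Sensitivity penalizes missed reference transitions and PPV penalizes spurious predictions, so a method can score well on one and poorly on the other, as the cut-point and TLBC figures above show.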